Models and Training for Unsupervised Preposition Sense Disambiguation
Abstract
We present a preliminary study on unsupervised preposition sense disambiguation (PSD), comparing different models and training techniques (EM, MAP-EM with L0 norm, Bayesian inference using Gibbs sampling). To our knowledge, this is the first attempt at unsupervised preposition sense disambiguation. Our best accuracy reaches 56%, a significant improvement (at p < .001) of 16% over the most-frequent-sense baseline.
Similar resources
Web-Scale N-gram Models for Lexical Disambiguation
Web-scale data has been used in a diverse range of language research. Most of this research has used web counts for only short, fixed spans of context. We present a unified view of using web counts for lexical disambiguation. Unlike previous approaches, our supervised and unsupervised systems combine information from multiple and overlapping segments of context. On the tasks of preposition sele...
A Flexible Unsupervised PP-Attachment Method Using Semantic Information
In this paper we revisit the classical NLP problem of prepositional phrase attachment (PP-attachment). Given the pattern V−NP1−P−NP2 in the text, where V is the verb, NP1 is a noun phrase, P is the preposition and NP2 is the other noun phrase, the question is where P−NP2 attaches: to V or to NP1? This question is typically answered using both word knowledge and world knowledge. Word Sense Disambig...
Web-Based Model for Disambiguation of Prepositional Phrase Usage
We explore some Web-based methods to differentiate strings of words corresponding to Spanish prepositional phrases that can function either as regular prepositional phrases or as idiomatic prepositional phrases. The type of these Spanish prepositional phrases is preposition–nominal phrase–preposition (P−NP−P), for example: por medio de ‘by means of’, a fin de ‘in order to’, con respecto a ‘with ...
MELB-YB: Preposition Sense Disambiguation Using Rich Semantic Features
This paper describes a maxent-based system entered in the preposition sense disambiguation task of SemEval 2007. The system uses a wide variety of semantic and syntactic features to perform the disambiguation task and achieves a precision of 69.3% on the test data.
Domain Specific Sense Disambiguation with Unsupervised Methods
Most approaches in sense disambiguation have been restricted to supervised training over manually annotated, non-technical, English corpora. Application to a new language or technical domain requires extensive manual annotation of appropriate training corpora. As this is both expensive and inefficient, unsupervised methods are to be preferred, specifically in technical domains such as medicine....